Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images
Convolutional neural networks (CNNs) show impressive performance for image
classification and detection, and are increasingly applied in the medical image
domain. Nevertheless, medical experts are sceptical of these predictions, as
the nonlinear multilayer structure that leads to a classification outcome is
not directly interpretable. Recently, approaches have been proposed that help
the user understand which discriminative regions within an image are decisive
for the CNN to conclude on a certain class. Although these approaches could
help build trust in a CNN's predictions, they have rarely been shown to work
with medical image data, which often poses a challenge because the decision for
a class relies on different lesion areas scattered across the entire image.
Using the DiaretDB1 dataset, we show that different lesion areas fundamental
for diabetic retinopathy are detected on retina images at the image level with
high accuracy, comparable to or exceeding supervised methods. At the lesion
level, we achieve few false positives with high sensitivity, although the
network is trained solely on image-level labels, which contain no information
about existing lesions. Classifying between diseased and healthy images, we
achieve an AUC of 0.954 on DiaretDB1.
Comment: Accepted in Proc. IEEE International Conference on Image Processing
(ICIP), 201
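The discriminative-region idea the abstract refers to is commonly realized as a class activation map: the final convolutional feature maps are weighted by the target class's classifier weights to obtain a heatmap of image regions that drove the decision. A minimal sketch follows; the function name and the use of plain NumPy arrays are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Weakly-supervised localization heatmap (illustrative sketch).

    feature_maps: (H, W, C) activations of the last conv layer.
    class_weights: (C,) weights of the target class in the final FC layer.
    Returns an (H, W) heatmap normalized to [0, 1].
    """
    # Weighted sum over channels: each feature map contributes
    # proportionally to its weight for the target class.
    cam = np.tensordot(feature_maps, class_weights, axes=([2], [0]))
    cam = np.maximum(cam, 0.0)  # keep only positive evidence for the class
    if cam.max() > 0:
        cam /= cam.max()        # normalize for visualization / thresholding
    return cam
```

Thresholding such a heatmap yields candidate lesion regions even though training used only image-level labels.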
Focusing computational visual attention in multi-modal human-robot interaction
Identifying verbally and non-verbally referred-to objects is an important aspect of human-robot interaction. Most importantly, it is essential to achieve a joint focus of attention and, thus, a natural interaction behavior. In this contribution, we introduce a saliency-based model that reflects how multi-modal referring acts influence visual search, i.e. the task of finding a specific object in a scene. To this end, we combine positional information obtained from pointing gestures with contextual knowledge about the visual appearance of the referred-to object obtained from language. The available information is then integrated into a biologically motivated saliency model that forms the basis for visual search. We demonstrate the feasibility of the proposed approach by presenting the results of an experimental evaluation.
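The fusion step described above can be sketched as a weighted combination of a bottom-up saliency map with two top-down cues: a spatial prior from the pointing gesture and an appearance match from language. The function name, the equal-size map assumption, and the linear weighting scheme are illustrative assumptions, not the paper's model:

```python
import numpy as np

def combined_saliency(bottom_up, pointing_prior, appearance_map,
                      weights=(0.4, 0.3, 0.3)):
    """Fuse bottom-up saliency with multi-modal top-down cues (sketch).

    bottom_up:       (H, W) stimulus-driven saliency map.
    pointing_prior:  (H, W) spatial prior derived from a pointing gesture.
    appearance_map:  (H, W) match map for the verbally described appearance.
    Returns an (H, W) search map normalized to [0, 1].
    """
    maps = (bottom_up, pointing_prior, appearance_map)
    # Normalize each cue to a comparable range, then blend linearly.
    fused = sum(w * m / (m.max() + 1e-8) for w, m in zip(weights, maps))
    return fused / (fused.max() + 1e-8)
```

Visual search then attends to the peaks of the fused map, so regions supported by both the gesture and the verbal description dominate.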